How to Calculate McNemar’s Test to Compare Two Machine Learning Classifiers
Statistical Hypothesis Tests for Deep Learning
Contingency Table
McNemar’s Test Statistic
Statistic: (Yes/No - No/Yes)^2 / (Yes/No + No/Yes)
For machine learning models, the Yes/No outcomes can presumably be replaced by classifier 1 correct/wrong and classifier 2 correct/wrong.
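As a rough illustration of the mapping above, the following sketch builds the 2x2 table from two classifiers' predictions and applies the statistic exactly as written in these notes (no continuity correction). The labels, predictions, and the helper names `contingency_table` and `mcnemar_statistic` are made up for illustration and are not from the original article.

```python
def contingency_table(y_true, pred_1, pred_2):
    """Return [[both correct, only clf 1 correct],
               [only clf 2 correct, both wrong]]."""
    table = [[0, 0], [0, 0]]
    for truth, p1, p2 in zip(y_true, pred_1, pred_2):
        row = 0 if p1 == truth else 1  # row: classifier 1 correct / wrong
        col = 0 if p2 == truth else 1  # column: classifier 2 correct / wrong
        table[row][col] += 1
    return table


def mcnemar_statistic(table):
    """(Yes/No - No/Yes)^2 / (Yes/No + No/Yes), using only the disagreement cells."""
    yes_no = table[0][1]  # classifier 1 correct, classifier 2 wrong
    no_yes = table[1][0]  # classifier 1 wrong, classifier 2 correct
    if yes_no + no_yes == 0:
        raise ValueError("the classifiers never disagree; the statistic is undefined")
    return (yes_no - no_yes) ** 2 / (yes_no + no_yes)


# Hypothetical labels and predictions on the same 10 test examples.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
pred_1 = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
pred_2 = [1, 1, 0, 1, 0, 0, 1, 0, 0, 1]

table = contingency_table(y_true, pred_1, pred_2)
stat = mcnemar_statistic(table)
print(table)
print(stat)
```

Only the two off-diagonal cells enter the statistic, so examples that both classifiers get right (or both get wrong) do not affect the result.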
Given the selection of a significance level, the p-value calculated by the test can be interpreted as follows:
p > alpha: fail to reject H0, no difference in the disagreement (e.g. treatment had no effect).
p <= alpha: reject H0, significant difference in the disagreement (e.g. treatment had an effect).
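The decision rule above can be sketched in a few lines. This assumes the statistic is compared against a chi-squared distribution with 1 degree of freedom, which is the usual large-sample approximation and can be unreliable when the disagreement counts are small. The helper names, the example statistic, and alpha = 0.05 are illustrative assumptions.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom.

    For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


def interpret(p_value, alpha=0.05):
    """Apply the p-value decision rule from the notes."""
    if p_value > alpha:
        return "fail to reject H0: no significant difference in the disagreement"
    return "reject H0: significant difference in the disagreement"


statistic = 1.8  # hypothetical test statistic
p_value = chi2_sf_1df(statistic)
decision = interpret(p_value, alpha=0.05)
print(p_value, decision)
```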
Interpret the McNemar’s Test for Classifiers
1. No Measure of Training Set or Model Variability
2. Less Direct Comparison of Models
McNemar’s Test in Python
Implementation using statsmodels
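A minimal sketch using `mcnemar` from `statsmodels.stats.contingency_tables`. The table values are made up for illustration; `exact=True` requests the exact binomial version of the test, which is generally preferred when the disagreement counts are small, while `exact=False, correction=True` gives the chi-squared version with a continuity correction.

```python
from statsmodels.stats.contingency_tables import mcnemar

# Hypothetical 2x2 table:
# rows = classifier 1 correct / wrong, columns = classifier 2 correct / wrong.
table = [[4, 4],
         [1, 1]]

# Exact binomial test on the two disagreement cells.
result = mcnemar(table, exact=True)
print("statistic=%.3f, p-value=%.3f" % (result.statistic, result.pvalue))

alpha = 0.05
if result.pvalue > alpha:
    print("Same proportions of errors (fail to reject H0)")
else:
    print("Different proportions of errors (reject H0)")
```

With `exact=True`, the reported statistic is the smaller of the two disagreement counts rather than the chi-squared value.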